Too little power for small sample sizes and too much power for large sample sizes
Very sensitive to normality assumption
The equal variance assumption is important mostly for unequal sample sizes.
Assumptions, 3 of 3
Observations are independent
Between groups
No matching
No longitudinal measures
Within groups
No cluster effects
No infectious spread
Assessed qualitatively
Housing data dictionary, 1 of 5
source:
  This file was found originally at a website
  DASL (Data And Story Library) that is no
  longer available.
description:
  The original source describes the data as
  "a random sample of records of resales of
  homes from Feb 15 to Apr 30, 1993 from the
  files maintained by the Albuquerque Board
  of Realtors. This type of data is
  collected by multiple listing agencies in
  many cities and is used by realtors as an
  information base."
Housing data dictionary, 2 of 5
copyright:
  Unknown. You should be able to use this data for
  individual educational purposes under the Fair Use
  guidelines of U.S. copyright law.
format:
  delimiter: space
  varnames: first row of data
  missing-value-code: *
  rows: 117
  columns: 8
Housing data dictionary, 3 of 5
vars:
  Price:
    label: Selling price
    unit: dollars
  SquareFeet:
    label: Living space
    unit: square feet
  AgeYears:
    label: Age of home
    unit: years
Housing data dictionary, 4 of 5
  NumberFeatures:
    label:
      Home features (dishwasher, refrigerator,
      microwave, disposer, washer, intercom,
      skylight(s), compactor, dryer, handicap
      fit, cable TV access)
    scale: count
    range: 0 to 11
  Northeast:
    label: Located in northeast sector of city?
    values:
      Yes: 1
      No: 0
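A minimal sketch of reading a file laid out the way the data dictionary describes it: space-delimited, variable names in the first row, and `*` as the missing value code. The function name `read_housing` and the two sample rows are made up for illustration. They are not records from the actual file, and only the five variables listed above appear in the sample.

```python
import io


def read_housing(handle, missing_code="*"):
    """Parse space-delimited text with a header row into a list of dicts.

    Fields equal to missing_code become None; all other fields become floats.
    """
    names = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        values = [None if f == missing_code else float(f) for f in fields]
        rows.append(dict(zip(names, values)))
    return rows


# Hypothetical two-row sample, NOT real records from the housing file.
sample = io.StringIO(
    "Price SquareFeet AgeYears NumberFeatures Northeast\n"
    "100000 1500 10 4 1\n"
    "90000 1200 * 3 0\n"
)
homes = read_housing(sample)
n_missing_age = sum(1 for h in homes if h["AgeYears"] is None)
print(len(homes), n_missing_age)
```

Converting the missing value code to `None` up front keeps a stray `*` from being treated as a number later in the analysis.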
“We designed the study to have 90% power to detect a 4 degree difference between the groups in the increased range of elbow flexion. Alpha was set at 0.05. Patients receiving electrical stimulation (n=26) increased their range of elbow flexion by a mean of 16 degrees with a standard deviation of 4.5, whereas patients in the control group (n=25) increased their range of flexion by a mean of only 6.5 degrees with a standard deviation of 3.4. This 9.5 degree difference was statistically significant (95% CI = 7.23 to 11.73, two-tailed Student’s t test, t=8.43, P < 0.001).”
Lang and Secic, How to Report Statistics in Medicine, p47.
Power calculation, if possible
Alpha level
Sample statistics for each group
P-value AND confidence interval
One or two sided test
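Because the quoted report gives the sample statistics for each group, a reader can roughly check the test statistic. The sketch below assumes the equal-variance (pooled) form of the two-sample t-test; the quote says "Student's t test" but does not spell out the formula, and the function name `pooled_t` is made up for illustration.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, se


# Summary statistics taken from the quoted report.
t_stat, df, se = pooled_t(26, 16.0, 4.5, 25, 6.5, 3.4)
print(round(t_stat, 2), df, round(se, 2))
```

The result lands near the reported t=8.43 without matching it exactly. A small gap like this is what you might see if the published means and standard deviations were rounded, though the source does not say.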
My interpretation
“Comparison of housing prices between groups used a two-sample t-test.”
“All tests were conducted using a two-sided alpha level of 0.05.”
“Houses that were custom built had an average price of 145 thousand dollars, which was 50 thousand dollars higher than the average price of normal houses. This difference was statistically significant (95% CI 36 thousand dollars to 64 thousand dollars, p=0.001).”
What I look for
The test you are using
Specify the number of tails
Round
Units of measure
Compare means dialog box
Compare means output
Geometric means
Geometric standard deviations
General linear model for log transformed data
Back-calculated statistics
Back calculated confidence intervals, 1 of 2
Back calculated confidence intervals, 2 of 2
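The slide titles above refer to geometric means, geometric standard deviations, and back-calculated statistics for log transformed data. A minimal sketch of the general idea follows: summarize on the log scale, then exponentiate. The numbers are made up for illustration and do not come from the housing data.

```python
import math
import statistics

# Hypothetical positive measurements, made up for illustration only.
values = [1.0, 10.0, 100.0]
other = [2.0, 20.0, 200.0]

logs = [math.log(v) for v in values]
other_logs = [math.log(v) for v in other]

# Back-calculate: summarize on the log scale, then exponentiate.
geo_mean = math.exp(statistics.mean(logs))
geo_sd = math.exp(statistics.stdev(logs))

# A difference in log means back-transforms to a ratio of geometric means.
log_diff = statistics.mean(other_logs) - statistics.mean(logs)
ratio = math.exp(log_diff)

print(geo_mean, geo_sd, ratio)
```

Confidence limits computed on the log scale back-transform the same way: exponentiating each limit gives an interval for the ratio of geometric means rather than for a difference.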
Conceptual formula for sample size justification, 1 of 2
Conceptual formula for sample size justification, 2 of 2
Moon data dictionary, 1 of 4
---
data_dictionary: moon.txt
source:
  This data file is part of OzDASL, an archive of various data sets
  useful for teaching. The maintainers of the data archive are from
  Australia, but this particular data set is not specific to that part
  of the world. The entire archive is at
  https://dasl.datadescription.com/
Moon data dictionary, 2 of 4
description:
  This data set shows a perceptual experiment where subjects were asked
  to estimate a size ratio with their head level to the ground and then
  with their head elevated (in other words, looking upward). Although
  the objects being compared were the same size, almost all subjects
  overestimated the relative sizes. The hypothesis to be tested is
  whether the overestimation is greater with eyes level than with eyes
  elevated. A more detailed description is available at
  https://gksmyth.github.io/ozdasl/general/moon.html
download:
  https://gksmyth.github.io/ozdasl/general/moon.txt
Moon data dictionary, 3 of 4
copyright:
  Unknown. You should be able to use this data for individual
  educational purposes under the Fair Use guidelines of U.S.
  copyright law.
format:
  delimiter: tab
  varnames: first row of data
  missing-value-code: not needed
  rows: 10
  columns: 3
Moon data dictionary, 4 of 4
vars:
  Subject:
    label: Subject number
    format: numeric
  Elevated:
    label: Perceived ratio with eyes elevated
    format: numeric
  Level:
    label: Perceived ratio with eyes level
    format: numeric
---
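A paired analysis works on the within-subject differences, so a natural first step is to read the tab-delimited file and subtract one column from the other. The sketch below is one way to do that. The function name `paired_differences`, the direction of subtraction (Elevated minus Level), and the three sample rows are all assumptions made for illustration; the rows are not the real moon.txt values.

```python
import csv
import io
import statistics


def paired_differences(handle, first="Elevated", second="Level"):
    """Read tab-delimited text with a header row; return first - second per row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [float(row[first]) - float(row[second]) for row in reader]


# Hypothetical three-subject sample, NOT the real moon.txt values.
sample = io.StringIO(
    "Subject\tElevated\tLevel\n"
    "1\t1.5\t1.25\n"
    "2\t1.0\t1.0\n"
    "3\t2.0\t2.5\n"
)
diffs = paired_differences(sample)
print(diffs, statistics.mean(diffs), statistics.stdev(diffs))
```

The mean and standard deviation of these differences are the inputs to the paired t-test, and the differences themselves are what a QQ plot of differences would display.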
QQ plot of ratio under elevated eye condition
QQ plot of ratio under level eye condition
QQ plot of differences
Moon data, descriptive statistics
Paired Samples T Test dialog box
Paired T Test output, 1 of 5
Paired T Test output, 2 of 5
Paired T Test output, 3 of 5
Paired T Test output, 4 of 5
Paired T Test output, 5 of 5
My interpretation
“Comparison of viewing ratios between the eyes level and eyes elevated conditions used a paired t-test.”
“All tests were conducted using a two-sided alpha level of 0.05.”
“Viewing ratios under both conditions had comparable sample means and standard deviations. The mean difference was 0.02 with a standard deviation of 0.14. This difference was not statistically significant (95% CI: -0.08 to 0.12, p=0.67).”
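The reported confidence interval can be checked from the summary statistics alone. This sketch assumes all 10 rows listed in the data dictionary contribute a difference, and it hard-codes the tabled critical value of the t distribution with 9 degrees of freedom (2.262) rather than computing it.

```python
import math

n = 10            # rows listed in the moon data dictionary
mean_diff = 0.02  # reported mean difference
sd_diff = 0.14    # reported standard deviation of the differences
t_crit = 2.262    # tabled 97.5th percentile of t with n - 1 = 9 df

se = sd_diff / math.sqrt(n)  # standard error of the mean difference
t_stat = mean_diff / se
lower = mean_diff - t_crit * se
upper = mean_diff + t_crit * se
print(round(t_stat, 2), round(lower, 2), round(upper, 2))
```

The limits round to the interval quoted above, which is a quick way to confirm that a reported mean difference, standard deviation, and confidence interval are mutually consistent.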
Three things you need for a sample size justification
Hypothesis
Standard deviation
Minimum clinically important difference
Rule of 16
n per group = 16 / ES^2
Examples:
ES = 0.5, n per group = 64
ES = 0.1, n per group = 1,600
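The two examples can be reproduced in a few lines, alongside the normal-approximation formula that the rule of 16 rounds off. The slide does not state the error rates behind the rule; the sketch assumes the conventional pairing of a two-sided alpha of 0.05 with 80% power, and the function names are made up for illustration.

```python
import math
from statistics import NormalDist


def n_rule_of_16(es):
    """Rule of 16: sample size per group for a standardized effect size."""
    return 16 / es**2


def n_normal_approx(es, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group (assumed alpha and power)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2 / es**2


for es in (0.5, 0.1):
    print(es, round(n_rule_of_16(es)), math.ceil(n_normal_approx(es)))
```

Under those assumed error rates the multiplier 2(z_alpha + z_power)^2 comes out a little below 16, so the rule errs slightly on the side of a larger sample.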
Sample size calculation dialog box
Sample size calculation output
Sample size calculation output for second scenario